GENSCAN 1.0    Date run: 21-Oct-105    Time: 08:17:24

Sequence Arabidopsis : 102580 bp : 37.96% C+G : Isochore 1 ( 0 - 43 C+G%)

Parameter matrix: Arabidopsis.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.01 Intr +    867    937   71  1  2   42   91    63 0.311   4.98
 1.02 Intr +   1560   1717  158  2  2  -72   88   157 0.615   2.59
 1.03 Intr +   2167   2246   80  1  2  119   85    68 0.998  12.88
 1.04 Intr +   2688   3020  333  1  0   62   59   273 0.464  21.42
 1.05 Intr +   4764   4944  181  1  1  -17   64   131 0.580   3.20
 1.06 Intr +   5280   5310   31  0  1  122   28    29 0.535   2.51
 1.07 Intr +   5592   5738  147  2  0   31   80   116 0.403   9.61
 1.08 Intr +   7268   7338   71  1  2   39  116   107 0.895  10.66
 1.09 Intr +   7427   7464   38  2  2  122   59     3 0.966   2.79
 1.10 Intr +   7543   7653  111  2  0   31   88    26 0.668   1.33
 1.11 Intr +   7978   8099  122  2  2   37   98    87 0.396   8.89
 1.12 Term +   8187   8291  105  2  0   53   38   160 0.990   9.93
 1.13 PlyA +   8393   8398    6                               -3.44

 2.08 PlyA -   8511   8506    6                                1.05
 2.07 Term -  10041   9015 1027  2  1   69   29   793 0.958  67.10
 2.06 Intr -  10317  10121  197  0  2  -38   38   303 0.969  14.99
 2.05 Intr -  11154  11010  145  2  1   55   94   228 0.999  24.46
 2.04 Intr -  11285  11221   65  2  2   84   67    55 0.983   4.60
 2.03 Intr -  11481  11380  102  0  0   52   88    63 0.945   7.15
 2.02 Intr -  12273  11965  309  0  0   20   68   372 0.987  28.98
 2.01 Init -  13665  12991  675  0  0   18   39   613 0.417  49.61
 2.00 Prom -  14055  14016   40                               -8.25

 3.00 Prom +  14117  14156   40                               -2.95
 3.01 Init +  15442  15471   30  0  0   81   88     4 0.205   4.60
 3.02 Intr +  16016  16173  158  1  2  -72   88   157 0.516   2.59
 3.03 Intr +  16623  16702   80  0  2  119   85    68 0.999  12.88
 3.04 Intr +  17144  17476  333  0  0   62   59   273 0.464  21.42
 3.05 Intr +  19220  19400  181  0  1  -17   64   131 0.580   3.20
 3.06 Intr +  19736  19766   31  2  1  123   28    29 0.553   2.61
 3.07 Intr +  20048  20194  147  1  0   31   80   116 0.414   9.61
 3.08 Intr +  21723  21793   71  2  2   39  116   107 0.895  10.66
 3.09 Intr +  21882  21919   38  0  2  122   59     3 0.966   2.79
 3.10 Intr +  21998  22108  111  0  0   31   88    26 0.668   1.33
 3.11 Intr +  22433  22554  122  0  2   37   98    87 0.396   8.89
 3.12 Term +  22642  22746  105  0  0   53   38   160 0.990   9.93
 3.13 PlyA +  22848  22853    6                               -3.44

 4.08 PlyA -  22966  22961    6                                1.05
 4.07 Term -  24496  23470 1027  0  1   69   29   793 0.958  67.10
 4.06 Intr -  24772  24576  197  1  2  -38   38   303 0.969  14.99
 4.05 Intr -  25609  25465  145  0  1   55   94   228 0.999  24.46
 4.04 Intr -  25740  25676   65  0  2   84   67    55 0.983   4.60
 4.03 Intr -  25936  25835  102  1  0   52   88    63 0.945   7.15
 4.02 Intr -  26728  26420  309  1  0   20   68   372 0.987  28.98
 4.01 Init -  28120  27446  675  1  0   18   39   613 0.417  49.61
 4.00 Prom -  28510  28471   40                               -8.25

 5.00 Prom +  28572  28611   40                               -2.95
 5.01 Init +  29897  29926   30  1  0   81   88     4 0.230   4.60
 5.02 Intr +  30445  30602  158  0  2  -72   88   157 0.651   2.59
 5.03 Intr +  31052  31131   80  2  2  119   85    68 0.997  12.88
 5.04 Intr +  31573  31892  320  2  2   62   56   259 0.249  19.85
 5.05 Term +  33629  34243  615  1  0    3   29   396 0.128  23.67
 5.06 PlyA +  34617  34622    6                                1.05

 6.04 PlyA -  35000  34995    6                                1.05
 6.03 Term -  35858  35839   20  0  2   94   48    20 0.784   1.10
 6.02 Intr -  37125  36649  477  1  0   39   88   295 0.327  21.69
 6.01 Init -  37327  37321    7  1  1   27   82     0 0.524  -0.48
 6.00 Prom -  37841  37802   40                               -7.85

 7.00 Prom +  38326  38365   40                               -9.55
 7.01 Init +  38451  38572  122  2  2   66   64    72 0.698   7.31
 7.02 Intr +  38908  38938   31  1  1  123   28    29 0.562   2.61
 7.03 Intr +  39220  39366  147  0  0   31   80   116 0.396   9.61
 7.04 Intr +  40895  40965   71  1  2   39  116   107 0.894  10.66
 7.05 Intr +  41054  41091   38  2  2  122   59     3 0.966   2.79
 7.06 Intr +  41170  41280  111  2  0   31   88    26 0.668   1.33
 7.07 Intr +  41605  41726  122  2  2   37   98    87 0.396   8.89
 7.08 Term +  41814  41918  105  2  0   53   38   160 0.990   9.93
 7.09 PlyA +  42020  42025    6                               -3.44

 8.08 PlyA -  42138  42133    6                                1.05
 8.07 Term -  43668  42642 1027  2  1   69   29   793 0.958  67.10
 8.06 Intr -  43944  43748  197  0  2  -38   38   303 0.969  14.99
 8.05 Intr -  44781  44637  145  2  1   55   94   228 0.999  24.46
 8.04 Intr -  44912  44848   65  2  2   84   67    55 0.983   4.60
 8.03 Intr -  45108  45007  102  0  0   52   88    63 0.945   7.15
 8.02 Intr -  45900  45592  309  0  0   20   68   372 0.987  28.98
 8.01 Init -  47292  46618  675  0  0   18   39   613 0.417  49.61
 8.00 Prom -  47682  47643   40                               -8.25

 9.00 Prom +  47744  47783   40                               -2.95
 9.01 Init +  48617  48749  133  1  1   22   46    50 0.671  -0.45
 9.02 Intr +  48872  48998  127  0  1   42   41   124 0.606   6.92
 9.03 Intr +  50172  50251   80  0  2  119   85    68 0.994  12.88
 9.04 Intr +  50693  51025  333  0  0   62   59   273 0.473  21.42
 9.05 Intr +  52771  52951  181  2  1  -17   64   131 0.593   3.20
 9.06 Intr +  53287  53317   31  1  1  122   28    29 0.536   2.51
 9.07 Intr +  53599  53745  147  0  0   31   80   116 0.402   9.61
 9.08 Intr +  55275  55345   71  2  2   39  116   107 0.895  10.66
 9.09 Intr +  55434  55471   38  0  2  122   59     3 0.966   2.79
 9.10 Intr +  55550  55660  111  0  0   31   88    26 0.668   1.33
 9.11 Intr +  55985  56106  122  0  2   37   98    87 0.396   8.89
 9.12 Term +  56194  56298  105  0  0   53   38   160 0.990   9.93
 9.13 PlyA +  56400  56405    6                               -3.44

10.08 PlyA -  56518  56513    6                                1.05
10.07 Term -  58048  57022 1027  0  1   69   29   793 0.958  67.10
10.06 Intr -  58324  58128  197  1  2  -38   38   303 0.969  14.99
10.05 Intr -  59161  59017  145  0  1   55   94   228 0.999  24.46
10.04 Intr -  59292  59228   65  0  2   84   67    55 0.983   4.60
10.03 Intr -  59488  59387  102  1  0   52   88    63 0.945   7.15
10.02 Intr -  60280  59972  309  1  0   20   68   372 0.987  28.98
10.01 Init -  61672  60998  675  1  0   18   39   613 0.436  49.61
10.00 Prom -  62062  62023   40                               -8.25

11.00 Prom +  62124  62163   40                               -2.95
11.01 Init +  64806  65070  265  2  1   64   25   440 0.697  37.52
11.02 Intr +  65843  66045  203  0  2   77   61   263 0.370  25.58
11.03 Intr +  66144  66314  171  2  0   76   23   284 0.999  24.92
11.04 Intr +  66399  66518  120  2  0   99   50   149 0.830  16.97
11.05 Intr +  66643  66789  147  0  0  104   13   174 0.897  16.01
11.06 Intr +  66904  66975   72  0  0   98   26   109 0.998   9.38
11.07 Intr +  67064  67603  540  1  0   52   94   513 0.999  45.47
11.08 Term +  67714  67863  150  0  0   44   43   128 0.855   5.83
11.09 PlyA +  68077  68082    6                               -3.24

12.06 PlyA -  68437  68432    6                                1.05
12.05 Term -  68905  68615  291  1  0  -46   42   314 0.991  12.76
12.04 Intr -  69122  68994  129  2  0  -13   84   109 0.862   5.37
12.03 Intr -  69341  69197  145  1  1   16   50    92 0.897   2.56
12.02 Intr -  69869  69701  169  0  1   11   78   227 0.399  17.08
12.01 Init -  70931  70898   34  2  1  104   63    44 0.564   8.58
12.00 Prom -  71375  71336   40                               -7.65

13.33 PlyA -  71515  71510    6                                1.05
13.32 Term -  72936  72813  124  2  1   37   45   155 0.464   7.98
13.31 Intr -  73347  73037  311  0  2   69  -23   251 0.605  11.29
13.30 Intr -  74396  73588  809  0  2  -19   75  1080 0.501  91.23
13.29 Intr -  74684  74477  208  2  1   27   42   194 0.604  11.43
13.28 Intr -  75537  75145  393  0  0   73   64   363 0.998  31.32
13.27 Intr -  75688  75608   81  1  0   81   87    89 0.999  12.02
13.26 Intr -  76271  76167  105  2  0   70   42   113 0.996   9.39
13.25 Intr -  76981  76375  607  0  1   99   88   437 0.991  41.32
13.24 Intr -  77335  77169  167  1  2   39   74   213 0.998  17.84
13.23 Intr -  77646  77514  133  2  1   37   42   145 0.930   9.53
13.22 Intr -  78053  77950  104  2  2  116   84   109 0.999  16.35
13.21 Intr -  78592  78464  129  1  0   23   76   144 0.902  11.67
13.20 Intr -  78765  78669   97  2  1   92  101    15 0.992   7.29
13.19 Intr -  79542  79415  128  0  2   36   42    83 0.996   2.06
13.18 Intr -  80078  80019   60  2  0  103   69    67 0.992   9.31
13.17 Intr -  80328  80167  162  0  0   -9   43   155 0.506   5.55
13.16 Intr -  80456  80376   81  2  0   65   75    97 0.997  10.02
13.15 Intr -  80766  80563  204  0  0   29  -12   162 0.479   3.77
13.14 Intr -  81049  80845  205  0  1   93   91   135 0.999  17.58
13.13 Intr -  81284  81204   81  1  0  107   78    -3 0.914   3.43
13.12 Intr -  81892  81699  194  1  2   71   56   269 0.999  24.37
13.11 Intr -  82281  82206   76  2  1   52   77    80 0.996   6.90
13.10 Intr -  83076  82943  134  0  2   23   84    88 0.917   5.42
13.09 Intr -  83386  83339   48  1  0   84   87    52 0.995   7.66
13.08 Intr -  83959  83879   81  1  0   76   18    91 0.948   4.82
13.07 Intr -  84228  84166   63  0  0   47   78    39 0.773   1.80
13.06 Intr -  84854  84795   60  2  0   74   82    31 0.943   4.11
13.05 Intr -  85386  85345   42  0  0   99   97    44 0.998   8.92
13.04 Intr -  85622  85497  126  2  0    8  105   164 0.921  15.06
13.03 Intr -  85912  85719  194  2  2   86    2   116 0.105   6.09
13.02 Intr -  86642  86610   33  0  0   -9   93    80 0.104   1.08
13.01 Init -  87540  87498   43  0  1   18   94    24 0.113   1.63
13.00 Prom -  89333  89294   40                               -5.95

14.04 PlyA -  89525  89520    6                                1.05
14.03 Term -  89850  89789   62  1  2   41   28   131 0.971   4.39
14.02 Intr -  91234  90752  483  2  0   63   83   539 0.998  47.97
14.01 Init -  93492  91320 2173  0  1   42   80  1437 0.921 128.84
14.00 Prom -  93662  93623   40                               -8.75

15.04 PlyA -  93706  93701    6                                1.05
15.03 Term -  95250  95144  107  1  2   28   48    74 0.385  -0.01
15.02 Intr -  95571  95306  266  2  2   49   77    85 0.322   4.73
15.01 Init -  98773  95631 3143  1  2   72   62  1211 0.530 113.12
15.00 Prom -  99372  99333   40                               -9.75

16.00 Prom +  99834  99873   40                               -8.65
16.01 Sngl + 101425 102054  630  0  0   43   42   469 0.991  38.73
16.02 PlyA + 102160 102165    6                               -1.95
Predicted peptide sequence(s): >Arabidopsis|GENSCAN_predicted_peptide_1|482_aa XRDQATRSSYCDKRARPLLEEGRQGRADELLETGELTSCLIRTGWRDTSEGRADELLETG ELTSCLRRTGWRDTSDGESPMAAPVVVAPAINLQHMDLPEGNEDVAAVLQTSTQRWLNAA EWSILYNNHNLFPESNSFTIPIEVEGLYRVDNRFNQDSNTWRNTSAPSTCKNAINFYNVR IATGLYRFDDDGDIVMDAEVQGGWVLCRKTVRNRGHCGTRLSHLYKPVFDRQFKMLKIHM MEDMIKDGEEKSLDTFVISRQLTTFTDLTVCLSTRKDRVTIKLKTRFPHQSSATYILDRF KTSSGSNQRVGSTSPEQNVREAAHFVLRRIFNLQYMGELQCHFEVYRNDELTVEELKRKN PRGVLISPGPGTPQDSGISLQTVLELGPLVPLFGVCMGLQCIGEAFGGRYHSLVIEKDTF PSDELEVTAWTEDGLVMAARHRKYKHIQGVQFHPESIITTEGKTIVRNFIKIVEKKDSEK LT >Arabidopsis|GENSCAN_predicted_peptide_2|839_aa MRLRHQGGQYLARGMEPVYWSNKVDSTHDLYDNYTLGSYQNNSYNILAHTGFGYDHEFQI YEINSNSWRIIDATLDFKLGYIGRGVSLKGKTYWIASGEEEKRLGKFLISFDYTTERFER LCLPNKYPCYATLALSVVREEKLSMLSQRGIKSKAEIWVTNKIGETQVVSWSMVLALDLQ PKRCIGESGSFLVDEKKRVVVCCDNIRDQGKTTVHIFGEDNKVKQSGRLQQTLAGSVEVK GKSLHSGKFSTVKLNPEIAGAGRFFEFRSRFIPASIEFAQESPLCTTLLKDELKIRTVEH LLSALEAKGVDNCRIQIESESSDDREVEVPAIGCQWFSWRPIHESSFAKDIASSRTFCVY EEVERMREAGLIKGGSLDNAIVCSAEHGWMNPPLRFDDEACRHKILDLIGDLSLVSRGGN GGLPVAHIVAYKNEEEEVRSSTEENENVRSSTEENEYVRSSTEAGRGEDPHEDSDNISES GAIHLQIMERFNQRSGSGKEEEEDTMDLSIKQVDLLNQVEISRVYHCDGLLLCVAKDNSR VVVWNPYLGQTRWIRPRTESNIGDSYALGYDINRNHKILRMVQTRNVSVYRYEIYDLRSN SWRVLEVTPNGEMDPNHPLYGVSVKGNTYFFAHEDSSSGEIDEDGDIIDLEDFLLCFDFT TETFGLRLPLPFHSTIDATVTLSCVRDQQLAVLYHNEGLHSDDRFTTVEFWVTTSIEPNS VSWSKFLTVDMRPLALTGVRFDNDMGATFFIDEDEKVAVVFDLDGYLSTESARYHTAFIS GKDGFFKPVTLGVAPNVGEPCPRTGHIPTTYRPPLVCSSTYLPSLVQVNQQRKRKERHV >Arabidopsis|GENSCAN_predicted_peptide_3|468_aa MDGQTRYFGRGRADELLETGELTSCLIRTGWRDTSEGRADELLETGELTSCLRRTGWRDT SDGESPMAAPVVVAPAINLQHMDLPEGNEDVAAVLQTSTQRWLNAAEWSILYNNHNLFPE SNSFTIPIEVEGLYRVDNRFNQDSNTWRNTSAPSTCKNAINFYNVRIATGLYRFDDDGDI VMDAEVQGGWVLCRKTVRNRGHCGTRLSHLYKPVFDRQFKMLKIHMMEDMIKDGEEKSLD TFVISRQLTTFTDLTVCLSTRKDRVTIKLKTRFPHQSSATYILDRFKTSSGSNQRVGSTS PEQNVREAAHFVLRRIFNLQYMGELQCHFEVYRNDELTVEELKRKNPRGVLISPGPGTPQ DSGISLQTVLELGPLVPLFGVCMGLQCIGEAFGGRYHSLVIEKDTFPSDELEVTAWTEDG LVMAARHRKYKHIQGVQFHPESIITTEGKTIVRNFIKIVEKKDSEKLT 
>Arabidopsis|GENSCAN_predicted_peptide_4|839_aa MRLRHQGGQYLARGMEPVYWSNKVDSTHDLYDNYTLGSYQNNSYNILAHTGFGYDHEFQI YEINSNSWRIIDATLDFKLGYIGRGVSLKGKTYWIASGEEEKRLGKFLISFDYTTERFER LCLPNKYPCYATLALSVVREEKLSMLSQRGIKSKAEIWVTNKIGETQVVSWSMVLALDLQ PKRCIGESGSFLVDEKKRVVVCCDNIRDQGKTTVHIFGEDNKVKQSGRLQQTLAGSVEVK GKSLHSGKFSTVKLNPEIAGAGRFFEFRSRFIPASIEFAQESPLCTTLLKDELKIRTVEH LLSALEAKGVDNCRIQIESESSDDREVEVPAIGCQWFSWRPIHESSFAKDIASSRTFCVY EEVERMREAGLIKGGSLDNAIVCSAEHGWMNPPLRFDDEACRHKILDLIGDLSLVSRGGN GGLPVAHIVAYKNEEEEVRSSTEENENVRSSTEENEYVRSSTEAGRGEDPHEDSDNISES GAIHLQIMERFNQRSGSGKEEEEDTMDLSIKQVDLLNQVEISRVYHCDGLLLCVAKDNSR VVVWNPYLGQTRWIRPRTESNIGDSYALGYDINRNHKILRMVQTRNVSVYRYEIYDLRSN SWRVLEVTPNGEMDPNHPLYGVSVKGNTYFFAHEDSSSGEIDEDGDIIDLEDFLLCFDFT TETFGLRLPLPFHSTIDATVTLSCVRDQQLAVLYHNEGLHSDDRFTTVEFWVTTSIEPNS VSWSKFLTVDMRPLALTGVRFDNDMGATFFIDEDEKVAVVFDLDGYLSTESARYHTAFIS GKDGFFKPVTLGVAPNVGEPCPRTGHIPTTYRPPLVCSSTYLPSLVQVNQQRKRKERHV >Arabidopsis|GENSCAN_predicted_peptide_5|400_aa MDGQTRYFGRGRADELLETGELTSCLIRTGWRDTSEGRADELLETGELTSCLRRTGWRDT SDGESPMAAPVVVAPAINLQHMDLPEGNEDVAAVLQTSTQRWLNAAEWSILYNNHNLFPE SNSFTIPIEVEGLYRVDNRFNQDSNTWRNTSAPSTCKNAINFYNVRIATGLYRFDDDGDI VMDAEVQGGWVLCRKTLLLILYPDCIGILCNTVFLSIWIKSGHEGKSAAFSVETLSRVRL DKSPDFINMSSPDSVHFSFYSKCLAKRNPYAMYLQSLRLGFCSLDMKGAISLLNDCKDVI PIAKLLYIALNRCAGNEVLDAFNTFKRENKCFTDVDIMAKTLIDHISDLEPKRFGSYAEL FHYEDYPDCWVIHEFYNEYNGERCSHCVYFFLFRDILLLS >Arabidopsis|GENSCAN_predicted_peptide_6|167_aa MEGSTSQEPSSQWDSNLPNRLFATDHYPRGRLNVYSRPDILSFIKKVFDGTEELDFILQS CFGPLFHLPVSRVATSGKVIHALLCRQLLTKKKYEFWTVFGGHPMRFSLLEFASVTGLPC GEFPDEYDPEDSPVYDDGKKSYWNELIGPDKTVTLGEILRTGHNREG >Arabidopsis|GENSCAN_predicted_peptide_7|248_aa MLKIHMMEDMIKDGEEKSLDTFVISRQLTTFTDLTVCLSTRKDRVTIKLKTRFPHQSSAT YILDRFKTSSGSNQRVGSTSPEQNVREAAHFVLRRIFNLQYMGELQCHFEVYRNDELTVE ELKRKNPRGVLISPGPGTPQDSGISLQTVLELGPLVPLFGVCMGLQCIGEAFGGRYHSLV IEKDTFPSDELEVTAWTEDGLVMAARHRKYKHIQGVQFHPESIITTEGKTIVRNFIKIVE KKDSEKLT >Arabidopsis|GENSCAN_predicted_peptide_8|839_aa MRLRHQGGQYLARGMEPVYWSNKVDSTHDLYDNYTLGSYQNNSYNILAHTGFGYDHEFQI 
YEINSNSWRIIDATLDFKLGYIGRGVSLKGKTYWIASGEEEKRLGKFLISFDYTTERFER LCLPNKYPCYATLALSVVREEKLSMLSQRGIKSKAEIWVTNKIGETQVVSWSMVLALDLQ PKRCIGESGSFLVDEKKRVVVCCDNIRDQGKTTVHIFGEDNKVKQSGRLQQTLAGSVEVK GKSLHSGKFSTVKLNPEIAGAGRFFEFRSRFIPASIEFAQESPLCTTLLKDELKIRTVEH LLSALEAKGVDNCRIQIESESSDDREVEVPAIGCQWFSWRPIHESSFAKDIASSRTFCVY EEVERMREAGLIKGGSLDNAIVCSAEHGWMNPPLRFDDEACRHKILDLIGDLSLVSRGGN GGLPVAHIVAYKNEEEEVRSSTEENENVRSSTEENEYVRSSTEAGRGEDPHEDSDNISES GAIHLQIMERFNQRSGSGKEEEEDTMDLSIKQVDLLNQVEISRVYHCDGLLLCVAKDNSR VVVWNPYLGQTRWIRPRTESNIGDSYALGYDINRNHKILRMVQTRNVSVYRYEIYDLRSN SWRVLEVTPNGEMDPNHPLYGVSVKGNTYFFAHEDSSSGEIDEDGDIIDLEDFLLCFDFT TETFGLRLPLPFHSTIDATVTLSCVRDQQLAVLYHNEGLHSDDRFTTVEFWVTTSIEPNS VSWSKFLTVDMRPLALTGVRFDNDMGATFFIDEDEKVAVVFDLDGYLSTESARYHTAFIS GKDGFFKPVTLGVAPNVGEPCPRTGHIPTTYRPPLVCSSTYLPSLVQVNQQRKRKERHV >Arabidopsis|GENSCAN_predicted_peptide_9|492_aa MHESHKYENGISREGRRIFLLMTSPRLRRIGRITNDDFFNEHKWKRDQATRSSYCDKRAR PLLEEGRQVIDEPADVRRAGWLLETCEESPMAAPVVVAPAINLQHMDLPEGNEDVAAVLQ TSTQRWLNAAEWSILYNNHNLFPESNSFTIPIEVEGLYRVDNRFNQDSNTWRNTSAPSTC KNAINFYNVRIATGLYRFDDDGDIVMDAEVQGGWVLCRKTVRNRGHCGTRLSHLYKPVFD RQFKMLKIHMMEDMIKDGEEKSLDTFVISRQLTTFTDLTVCLSTRKDRVTIKLKTRFPHQ SSATYILDRFKTSSGSNQRVGSTSPEQNVREAAHFVLRRIFNLQYMGELQCHFEVYRNDE LTVEELKRKNPRGVLISPGPGTPQDSGISLQTVLELGPLVPLFGVCMGLQCIGEAFGGRY HSLVIEKDTFPSDELEVTAWTEDGLVMAARHRKYKHIQGVQFHPESIITTEGKTIVRNFI KIVEKKDSEKLT >Arabidopsis|GENSCAN_predicted_peptide_10|839_aa MRLRHQGGQYLARGMEPVYWSNKVDSTHDLYDNYTLGSYQNNSYNILAHTGFGYDHEFQI YEINSNSWRIIDATLDFKLGYIGRGVSLKGKTYWIASGEEEKRLGKFLISFDYTTERFER LCLPNKYPCYATLALSVVREEKLSMLSQRGIKSKAEIWVTNKIGETQVVSWSMVLALDLQ PKRCIGESGSFLVDEKKRVVVCCDNIRDQGKTTVHIFGEDNKVKQSGRLQQTLAGSVEVK GKSLHSGKFSTVKLNPEIAGAGRFFEFRSRFIPASIEFAQESPLCTTLLKDELKIRTVEH LLSALEAKGVDNCRIQIESESSDDREVEVPAIGCQWFSWRPIHESSFAKDIASSRTFCVY EEVERMREAGLIKGGSLDNAIVCSAEHGWMNPPLRFDDEACRHKILDLIGDLSLVSRGGN GGLPVAHIVAYKNEEEEVRSSTEENENVRSSTEENEYVRSSTEAGRGEDPHEDSDNISES GAIHLQIMERFNQRSGSGKEEEEDTMDLSIKQVDLLNQVEISRVYHCDGLLLCVAKDNSR VVVWNPYLGQTRWIRPRTESNIGDSYALGYDINRNHKILRMVQTRNVSVYRYEIYDLRSN 
SWRVLEVTPNGEMDPNHPLYGVSVKGNTYFFAHEDSSSGEIDEDGDIIDLEDFLLCFDFT TETFGLRLPLPFHSTIDATVTLSCVRDQQLAVLYHNEGLHSDDRFTTVEFWVTTSIEPNS VSWSKFLTVDMRPLALTGVRFDNDMGATFFIDEDEKVAVVFDLDGYLSTESARYHTAFIS GKDGFFKPVTLGVAPNVGEPCPRTGHIPTTYRPPLVCSSTYLPSLVQVNQQRKRKERHV >Arabidopsis|GENSCAN_predicted_peptide_11|555_aa MSDVSGDGDLSATVTEHEVTPQPPVSSATYPSLTVSASYKESSGGKSSSKRRPIRPSFDA AADNEFITLLHGSDPVKVELNRLENEVRAYRHMISTLLQNLEIKKINEEKKASMAAQFAA EATLRRVHAAQKDDDMPPIEAILAPLEAELKLARSEIGKLQEDNRALDRLTKSKEAALLE AERTVEAAMAKAAMVDDLQNKNQELMKQIEICQEENKILDRMHRQKVAEVEKLTQTVREL EEAVLAGGAAANAEERKTLDRELARAKVTANRVATVVANEWKDGNDKVMPVKQWLEERRF LQGEMQQLRDKLAITDRAAKSEAQLKEKFQLRLKVLEETLRGTSSSATRNTPEARSMSNG PSRRQSLGGAENLQKFTSNGALSKKAPASQMRHSLSINSTSVLKNAKGTSKSFDGGTRSV DRGKALLNGPGNYSFNKATDDSKEAESGNGWKENSEEKPQSEDPEAATEDSVPGVLYDLL QKEVVSLRKASNEKDQSLKDKDDAIEMLAKKVETLTKAMEVEAKEDETGSSCNGKRSCCN ACGERSGRQSQKVFK >Arabidopsis|GENSCAN_predicted_peptide_12|255_aa MHERLMKLVKVDFLLMLNPNQLMYYTVVVVNQNLVYVKQYIFETTAYPREHEQLKKLREA TVLKYGNLSEMEVPVDEGHFLSMLLKIMNAKKTIELGVFTGYSLLTTALALPHDGHVTGI DIDKEAYEMGLEFIKNAGVHHKINFIHSDCLQALDNMLSENPKPEFDFAFVDADKPNYAN MHERLMKLVKVGGVIAFDNTLWSGFVAEKEENVPVHMRVNRKAFLDLNKRLAADPHVEVS QVSIGDGVTLCRRLV >Arabidopsis|GENSCAN_predicted_peptide_13|1760_aa MHDGTLVEDFYFSQGGRGGEEGEDDWFRKGRRVKDIHTYTRKGFKMSLPLLECKYVTEEF VREGKNGNYGTKLPSSVPMLRFLYELSWILVRGELPIQSCKAVLEGVEFLDKPSREELAS CFADVVTQIAQDLTMSGDQRSRLIKLAKWLVESQTVPQRLFQERCEEEFLWEADMVKIKA QDLKGKEVRLNTRLLYQQTKFNLLREESEGYAKLSLIGHFDLDPNRVFDISHASQILGFK FQYYQRLEVNSPVPVGLYKLTALLVKEEFINLESIYAHLLPKDEEVFEDYNVSSAKRFEE ANKIGKINLAATGKDLMEDEKQGDVTVDLFAALDMESEAVTERLPELENNQTLGLLNGFL SVDDWYHANILFERLAPLNPVAHDQICSGLFRLIEKSITHSYRIARQTRFQSSSSASTVK LTPTANTTANRTYLDLPKEVFQMLVTVGPYLYRNTQLLQKICRVLRAYYLSALDLVRDGS NQEGSAYEVSRGHLKEVRLRVEEALGTCLLPSLQLVPANPAVGHEIWEARYRLYGEWEKD DEQNPLLLAARQVAKIHPPDPHETTLVCQLDTRRILKRLAKENLKQLGRMVAKLAHANPM TVLRTIVNQIEAYRDMIAPVVDAFKYLTQLEYDILEYVVIERLAQSGRDKLKDDGINLSD WLQSLASFWGHLCKKYPSMELRGLFQYLVNQLKRGQGIELVLLQELVQQMANVQYTENLT 
EDQLDAMAGSETLRYHATSFGMMRNNKALIKSSNRLRDSLLPNDEPKLAIPLLLLIAQHR SLVVVNADAPYIKMVTEQFDRCHGILLQYVDFLSSAVSPTTAYARLVAFLVFRPVMRLFK CRRNGDVSWPLDSGESMDADSEISESESSMILDVGTSEKAVTWSDVLDTVRTMLPSKAWN SLSPDLYATFWGLTLYDLHVPRNRYESEISKQHTALKTLEEVADNSSSAITKRKKEKERI QESLDRLTGELKKHEEHVASVRRRLSREKDTWLSSCPDTLKINMEFLQRCIFPRCTFSMA DSVYCAMFVNMLHSLGTPFFNTVNHIDVLICKTLQPMICCCTEYEVGRLGRFLFETLKIA YHWKSKESVYEHECGNMPGFAVYYRYPNSQRVTFGQFVKVAKIKNDEREDLKVLATGVGA ALSARKPHWVTDEEFSMGFLELKAPPVHTPKHASSQNGLLVGVSQGEPTGERATVNQQPE SGGLGKDQMLKTKPLDGRTESIPSKSDQGHLKSKGGNPLDSQPSISKKSMEQKETDETPR ISDENPVKPASKYSEAELKASSKRGASVNKSAKQDFGKDDGKSGKAIGRTSTADKDLNYL ESRQSGLTKALSSTAANGSIATGSSKAVKDDGAEALDAQKQSSRTVHSPRHEIVTSVRSS DRLQKRANAVEDSERISKRRKGDAEHKEHDSEPRSSDRDRSVEARLDLNKTVTDDQSTHR DQDRSKDKGYERQDRDHRERVDRSDKPRGDDVEKARDKSLERHGRERSVEKGLDKGTTRS YDRNKDERNKDDRSKLRHSEASLEKSHPDDHFHSQGLPPPPPLPPNIIPHSMAAKEDLER RAGGARHSQRLSPRHEEREKRRSEENLSVSVDDAKRRRDDDIRDRKRDDRETITVKGEER EREREREREREKSLPLKEDFEASKRRKLKREQQVPSAEPGEYSPMPHHSSLSTSMGPSSY EGRERKSSSMIQHGGYLEEPSIRLLGKEASSKMARRDPDPIAKSKSKNSNFLDIALESMT VNGKTTRGEQSGSGEIGSRE >Arabidopsis|GENSCAN_predicted_peptide_14|905_aa MIAKNFLLLLCFIALVNVESSPDEAVMIALRDSLKLSGNPNWSGSDPCKWSMFIKCDASN RVTAIQIGDRGISGKLPPDLGKLTSLTKFEVMRNRLTGPIPSLAGLKSLVTVYANDNDFT SVPEDFFSGLSSLQHVSLDNNPFDSWVIPPSLENATSLVDFSAVNCNLSGKIPDYLFEGK DFSSLTTLKLSYNSLVCEFPMNFSDSRVQVLMLNGQKGREKLHGSISFLQKMTSLTNVTL QGNSFSGPLPDFSGLVSLKSFNVRENQLSGLVPSSLFELQSLSDVALGNNLLQGPTPNFT APDIKPDLNGLNSFCLDTPGTSCDPRVNTLLSIVEAFGYPVNFAEKWKGNDPCSGWVGIT CTGTDITVINFKNLGLNGTISPRFADFASLRVINLSQNNLNGTIPQELAKLSNLKTLDVS KNRLCGEVPRFNTTIVNTTGNFEDCPNGNAGKKASSNAGKIVGSVIGILLALLLIGVAIF FLVKKKMQYHKMHPQQQSSDQDAFKITIENLCTGVSESGFSGNDAHLGEAGNIVISIQVL RDATYNFDEKNILGRGGFGIVYKGELHDGTKIAVKRMESSIISGKGLDEFKSEIAVLTRV RHRNLVVLHGYCLEGNERLLVYQYMPQGTLSRHIFYWKEEGLRPLEWTRRLIIALDVARG VEYLHTLAHQSFIHRDLKPSNILLGDDMHAKVADFGLVRLAPEGTQSIETKIAGTFGYLA PEYAVTGRVTTKVDVYSFGVILMELLTGRKALDVARSEEEVHLATWFRRMFINKGSFPKA IDEAMEVNEETLRSINIVAELANQCSSREPRDRPDMNHVVNVLVSLVVQWKPTERSSDSE 
DIYGIDYDTPLPQLILDSCFFGDNTLTSILRDLPSLKAHSNQDKGDCRLLNDNDTGEPNR DQARS >Arabidopsis|GENSCAN_predicted_peptide_15|1171_aa MHSRDDLVDIQSWLEYDQVYTVEPVGKCGGLALLWKSSVQVDLKFVDKNLMDAQVQFGAV NFCVSCVYGDPDRSKRSQAWERISRIGVGRRDKWCMFGDFNDILHNGEKNGGPRRSDLDC KAFNEMIKGCDLVEMPAHGNGFTWAGRRGDHWIQCRLDRAFGNKEWFCFFPVSNQTFLDF RGSDHRPVLIKLMSSQDSYRGQFRFDKRFLFKEDVKEAIIRTWSRGKHGTNISVADRLRA CRKSLSSWKKQNNLNSLDKINQLEAALEKEQSLVWPIFQRVSVLKKDLAKAYREEEAYWK QKSRQKWLRSGNRNSKYFHAAVKQNRQRKRIEKLKDVNGNMQTSEAAKGEVAAAYFGNLF KSSNPSGFTDWFSGLVPRVSEVMNESLVGEVSAQEIKEAVFSIKPASAPGPDGMSALFFQ HYWSTVGNQVTSEVKKFFADGIMPAEWNYTHLCLIPKTQHPTEMVDLRPISLCSVLYKII SKIMAKRLQPWLPEIVSDTQSAFVSERLITDNILVAHELVHSLKVHPRISSEFMAVKSDM SKAYDRVEWSYLRSLLLSLGFHLKWVNWIMVCVSSVTYSVLINDCPFGLIILQRGLRQGD PLSPFLFVLCTEGLTHLLNKAQWEGALEGIQFSENGPMVHHLLFADDSLFLCKASREQSL VLQKILKVYGNATGQTINLNKSSITFGEKVDEQLKGTIRTCLGIFTEGGAGTYLGLPECF SGSKVDMLHYLKDRLKEKLDVWFTRCLSQGGKEVLLKSVALAMPVFAMSCFKLPITTCEN LESAMASFWWDSCDHSRKIHWQSWERLCLPKDSGGLGFRDIQSFNQALLAKQAWRLLHFP DCLLSRLLKSRYFDATDFLDAALSQRPSFGWRSILFGRELLSKGLQKRVGDGASLFVWID PWIDDNGFRAPWRKNLIYDVTLKVKALLNPRTGFWDEEVLHDLFLPEDILRIKAIKPVIS QADFFVWKLNKSGDFSVKSAYWLAYQTKSQNLRSEVSMQPSTLGLKTQVWNLQTDPKIKI FLWKVLSGILPVAENLNGRGMSLIPHVRQIWALSDYPFPPDGFSNGSIYSNINHLLENKD NKEWPINLRKIFPWILWRIWKNRNSFIFEGISYPATDTVIKIRDDVVEWFEAQCLDVYRR IYGLNLQRIGLNVTLGLLGKKETILLEGLGF >Arabidopsis|GENSCAN_predicted_peptide_16|209_aa MTESDDASRETPASRGGEASSNQDLSKPESNHVSLDLKLNDTFNDDTKSTKCEANPRVFS CNYCRRKFYSSQALGGHQNAHKRERTMAKRAMHMGRMFGHHHRPYTYTSSSLGMQAHSGL LHHTLSQPQPLVSRFHHQGYFGNTVPLFFDYDDGGSDFFWPGSFRQVVEEAEAPVVVVAS TESGLDLNSVAANGGVDNNSSKPDLTLRL

Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)

Comments

The SCORE of a predicted feature (e.g., exon or splice site) is a log-odds measure of the quality of the feature based on local sequence properties. For example, a predicted 5' splice site with score > 100 is strong; 50-100 is moderate; 0-50 is weak; and below 0 is poor (more than likely not a real donor site).

The PROBABILITY of a predicted exon is the estimated probability under GENSCAN's model of genomic sequence structure that the exon is correct. This probability depends in general on global as well as local sequence properties, e.g., it depends on how well the exon fits with neighboring exons. It has been shown that predicted exons with higher probabilities are more likely to be correct than those with lower probabilities.
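
The column definitions and score thresholds above can be illustrated with a minimal Python sketch. It is not part of GENSCAN: the names Exon, parse_exon_row and donor_strength are hypothetical, and the handling of the exact boundary values 50 and 100 is an assumption, because the quoted ranges overlap at those points. The sketch reads one 13-field exon row (Init, Intr, Term or Sngl; Prom and PlyA rows carry fewer fields and are rejected) and labels a Do/T score, which is a 5' splice site score only for Init and Intr exons.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    """One exon row of the prediction table (hypothetical container)."""
    gn_ex: str    # gene number, exon number
    kind: str     # Init, Intr, Term or Sngl
    strand: str   # "+" or "-"
    begin: int
    end: int
    length: int
    frame: int
    phase: int
    i_ac: int     # tenth bit units
    do_t: int     # tenth bit units
    cod_rg: int   # tenth bit units
    prob: float
    tscr: float


def parse_exon_row(row: str) -> Exon:
    """Split a whitespace-separated exon row into typed fields."""
    f = row.split()
    if len(f) != 13:
        raise ValueError("expected 13 fields (Prom/PlyA rows have fewer)")
    return Exon(f[0], f[1], f[2], *map(int, f[3:11]), float(f[11]), float(f[12]))


def donor_strength(score: int) -> str:
    """Label a 5' splice site score using the thresholds in the Comments.

    Assumption: 100 counts as moderate and 50 as moderate, 0 as weak.
    """
    if score > 100:
        return "strong"
    if score >= 50:
        return "moderate"
    if score >= 0:
        return "weak"
    return "poor"


exon = parse_exon_row("1.03 Intr + 2167 2246 80 1 2 119 85 68 0.998 12.88")
print(donor_strength(exon.do_t))                       # moderate
print(exon.length % 3 == exon.phase)                   # Ph is length modulo 3
print(abs(exon.end - exon.begin) + 1 == exon.length)   # Len spans Begin..End
```

Using abs() for the length check keeps it valid for minus-strand rows, where Begin is larger than End.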